Term Extraction and Mining of Term Relations from Unrestricted Texts in the Financial Domain

Authors

  • Feiyu Xu
  • Daniela Kurz
  • Jakub Piskorski
  • Sven Schmeier
Abstract

In this paper, we present an unsupervised hybrid text-mining approach to the automatic acquisition of domain-relevant terms and their relations. We deploy a TFIDF-based term classification method to acquire domain-relevant terms. Further, we apply two strategies to learn lexico-syntactic patterns that indicate paradigmatic and domain-relevant syntagmatic relations between the extracted terms. The first uses GermaNet, while the second is based on different collocation acquisition methods to deal with free word order languages like German. This domain-adaptive method yields good results even when trained on relatively small training corpora. Therefore, it can be applied to information extraction and retrieval tasks within a real-world business information system.
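The abstract does not spell out the exact TFIDF variant the authors use, so the following is only a minimal sketch of the general idea: words that are frequent in the domain corpus but rare across all documents receive high scores. The function names `tfidf_scores` and `top_terms`, the pre-tokenized input format, and the plain `tf * log(N / df)` weighting are assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter


def tfidf_scores(domain_docs, background_docs):
    """Score words of the domain corpus with a plain tf * log(N / df) weight.

    Both arguments are lists of documents; each document is a list of tokens.
    This generic weighting is an assumption, not the scheme from the paper.
    """
    all_docs = domain_docs + background_docs
    n_docs = len(all_docs)

    # Document frequency: in how many documents does each word occur?
    df = Counter()
    for doc in all_docs:
        df.update(set(doc))

    # Term frequency: how often does each word occur in the domain corpus?
    tf = Counter(word for doc in domain_docs for word in doc)

    return {word: tf[word] * math.log(n_docs / df[word]) for word in tf}


def top_terms(scores, k):
    """Return the k highest-scoring words, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]
```

A word that occurs in every document gets a weight of zero under this scheme, which is why function words drop to the bottom of the ranking while domain vocabulary rises.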


Similar resources

Extraction d'arguments de relations n-aires dans les textes guidée par une RTO de domaine. (Extraction of arguments in N-ary relations in texts guided by a domain OTR)

Today, a huge amount of data is made available to the research community through several web-based libraries. Enhancing data collected from scientific documents is a major challenge in order to analyze and reuse efficiently domain knowledge. To be enhanced, data need to be extracted from documents and structured in a common representation using a controlled vocabulary as in ontologies. Our rese...


Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch

In this research, we evaluate different approaches for the automatic extraction of hypernym relations from English and Dutch technical text. The detected hypernym relations should enable us to semantically structure automatically obtained term lists from domain- and user-specific data. We investigated three different hypernymy extraction approaches for Dutch and English: a lexico-syntactic pattern...


Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...


Extracting Semantic Relationships between Terms: Supervised vs. Unsupervised Methods

As the amount of electronic documents (corpora, dictionaries, newspapers, newswires, etc.) becomes more and more important and diversified, there is a need to extract information automatically from these texts. In order to extract terms and relations between terms, two methods can be used. The first method is the unsupervised approach, which requires a term extraction module and few predefined ...


Long-term stability analysis of goaf area in longwall mining using minimum potential energy theory

Estimation of the height of caved and fractured zones above a longwall panel along with the stability conditions of the goaf area are very crucial to determine the abutment stresses, ground subsidence, and face support as well as designing the surrounding gates and intervening pillars. In this work, the height of caving-fracturing zone above the mined panel is considered as the height of destre...




Publication date: 2002